Abstract. A variant of the classical optimal transportation problem
is: among all joint measures with fixed marginals and which are
dominated by a given density, find the optimal one. Existence and
uniqueness of solutions to this variant were established in [KM11].
In the present manuscript, we expose an unexpected symmetry leading
to the first explicit examples in two and more dimensions. These are
inspired in part by simulations in one dimension which display
singularities and topology and in part by two further developments:
the identification of all extreme points in the feasible set, and a
new approach to uniqueness based on constructing feasible
perturbations.
1. Introduction

Given fixed distributions of supply and demand, the optimal
transportation problem of Monge [M81] and Kantorovich [K42] involves
pairing supply with demand so as to minimize the average
transportation cost $c(x,y)$ between each supplier $x$ and the
demander $y$ with whom $x$ is paired. For continuous distributions,
this question forms an (the?) archetypal example of an
infinite-dimensional linear program. Its relevance to the physics of
fluids has been recognized since the work of Brenier [B87] and Cullen
and Purser [CP89], while some of its applications to geometry,
dynamics, partial differential equations, economics and statistics
are described in [MG10] [RR98] [V09] and the references there. It is
desirable to introduce congestion effects into this model, as can be
attempted in various ways [CJS08]; one of the crudest is simply to
bound the number of suppliers at $x$ who can be paired with demanders
at $y$, for each $x$ and $y$. Despite its appeal, for continuous
distributions of supply and demand, this variant seems not to have
been studied until [KM11].
As in all linear programs, if the problem has solutions, at least one
of them will be an extreme point of the feasible set. A remaining
challenge in the Monge-Kantorovich transportation problem is to
arrive at a characterization of the extreme points which yields
useful information about the geometry and topology of its solutions
[AKM11]. Somewhat surprisingly, such a characterization is much more
accessible in our capacity constrained variant; as shown below, it
can basically be reduced to a 'bang-bang' (all or nothing) principle.
As a corollary, this characterization implies the uniqueness of
solutions first established in [KM11]. Moreover, it combines with
elementary but obscure symmetries to yield the first explicitly
soluble examples in more than one dimension, and with numerical and
theoretical considerations to give insights into the geometric and
topological aspects of basic examples which, even in one dimension,
still defy explicit solution.

The problem in question is formulated precisely as follows: Given
densities $0 \le f, g \in L^1(\mathbf{R}^d)$ with same total mass
$\int f = \int g$, let $\Gamma(f,g)$ denote the set of joint
densities $0 \le h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$ which
have $f$ and $g$ as their marginals, meaning

$$f(x) = \int_{\mathbf{R}^d} h(x,y)\,dy \qquad\text{and}\qquad \int_{\mathbf{R}^d} h(x,y)\,dx = g(y)$$

for Lebesgue almost all $x, y \in \mathbf{R}^d$. A bounded function
$c(x,y)$ represents the cost per unit mass for transporting material
from $x \in \mathbf{R}^d$ to $y \in \mathbf{R}^d$. The (total)
transportation cost of $h$, denoted $c[h]$ and defined by

$$c[h] := \int_{\mathbf{R}^d \times \mathbf{R}^d} c(x,y)\,h(x,y)\,dx\,dy, \tag{1}$$

is proportional to the expected value of $c$ with respect to $h$.

Given $0 \le \bar h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$, we let
$\Gamma(f,g)^{\bar h}$ denote the set of all $h \in \Gamma(f,g)$
dominated by $\bar h$, that is $h \le \bar h$ almost everywhere. The
optimization problem we are concerned with (optimal transport with
capacity constraints) is to minimize the transportation cost (1)
among joint densities $h$ in $\Gamma(f,g)^{\bar h}$, to obtain the
optimal cost

$$\min_{h \in \Gamma(f,g)^{\bar h}} c[h] \tag{2}$$

under the capacity constraint $\bar h$.
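To make the definitions concrete, here is a minimal sketch of a discrete analogue, assuming a plan is stored as a matrix of cell masses on a finite grid. The names `transport_cost`, `is_feasible` and the tolerance `tol` are hypothetical and do not come from the paper: marginals become row and column sums, and the capacity bound becomes an entrywise inequality.

```python
# Hypothetical discretization: h[i][j] is the mass sent from grid point x_i to y_j.

def transport_cost(c, h):
    """Discrete analogue of (1): sum of c[i][j] * h[i][j] over all cells."""
    return sum(cij * hij for crow, hrow in zip(c, h) for cij, hij in zip(crow, hrow))

def is_feasible(h, f, g, hbar, tol=1e-12):
    """Check row sums = f, column sums = g and 0 <= h <= hbar entrywise."""
    rows_ok = all(abs(sum(row) - fi) <= tol for row, fi in zip(h, f))
    cols_ok = all(abs(sum(col) - gj) <= tol for col, gj in zip(zip(*h), g))
    bounds_ok = all(-tol <= hij <= bij + tol
                    for hrow, brow in zip(h, hbar)
                    for hij, bij in zip(hrow, brow))
    return rows_ok and cols_ok and bounds_ok

n = 4
x = [(i + 0.5) / n for i in range(n)]            # midpoints of [0, 1]
f = g = [1.0 / n] * n                            # uniform marginals
c = [[-xi * yj for yj in x] for xi in x]         # bilinear cost -x*y
hbar = [[3.0 / n**2] * n for _ in range(n)]      # constant capacity density 3

uniform = [[1.0 / n**2] * n for _ in range(n)]   # product plan
diagonal = [[1.0 / n if i == j else 0.0 for j in range(n)] for i in range(n)]

print(is_feasible(uniform, f, g, hbar))    # product plan respects the bound
print(is_feasible(diagonal, f, g, hbar))   # identity pairing exceeds it
```

In this toy setting the identity pairing, which would be optimal for the bilinear cost without a capacity bound, is excluded by the constraint, while the product plan remains admissible.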

Notice this problem involves a linear minimization on a convex set
$\Gamma = \Gamma(f,g)^{\bar h}$, and therefore takes the form of an
infinite-dimensional linear program. When $\Gamma$ is non-empty, it
is not hard to show that the minimum is attained, and that at least
one of the minimizers $h$ is an extreme point of $\Gamma$, meaning
$h$ is not the midpoint of any segment in $\Gamma$. Since the
possible facet structure of $\Gamma$ is not obvious, it is harder to
determine whether or not this minimizer is unique. A sufficient
condition for uniqueness was discovered in [KM11], and is recalled
below. It is even harder to envision what the solutions will look
like. Perhaps the simplest example involves pairing Lebesgue measure
on the unit interval with itself so as to minimize the quadratic
transportation cost $c(x,y) = |x - y|^2$ (well-known to be equivalent
to $c(x,y) = -x \cdot y$, as in the unconstrained problem [B87]). As
the capacity constraint $\bar h$ is varied over different constant
values, the numerical solutions below display an unexpected variety
of strange topological features and analytic singularities begging to
be understood. Even though supply equals demand in these examples,
and the constraints permit at least some of the demand to be supplied
locally at zero cost, the islands of blue in these diagrams show that
global optimality may require there to be regions where none of the
demand is supplied locally.

In this paper we establish symmetries which explain at least some of
the observed structures (Lemma 4.1). These also lead to the first
explicit examples of optimizers in higher dimensions (Proposition
4.2). We precede this with a simple description of the extreme points
of $\Gamma$, based on a perturbation argument. The same idea is also
used to substantially simplify the uniqueness argument from [KM11].
The original argument relied on understanding the infinitesimal
behaviour of an optimizer near its Lebesgue points in order to argue
that any optimizer is geometrically extreme, a property which
characterizes extreme points of $\Gamma(f,g)^{\bar h}$ (see
Proposition 3.2). The new proof begins from an 'all or nothing'
characterization of extreme points and uses perturbations to argue
directly, without asymptotics or blow ups, that every optimizer is
extreme. Uniqueness follows easily.

Acknowledgements. We would like to thank Brian Wetton for sharing
with us the figures and the MATLAB code that generated them; these
simulations inspired Lemma 4.1.

2. Assumptions 

We make the following assumptions throughout (see [KM11] for more
details).

2.1. Assumptions on the cost.

(C1) $c(x,y)$ is bounded,
(C2) there is a Lebesgue negligible closed set
     $Z \subset \mathbf{R}^d \times \mathbf{R}^d$ such that
     $c(x,y) \in C^2(\mathbf{R}^d \times \mathbf{R}^d \setminus Z)$ and,
(C3) $c(x,y)$ is non-degenerate: $\det[D^2_{x^i y^j} c(x,y)] \ne 0$
     for all $(x,y) \in \mathbf{R}^d \times \mathbf{R}^d \setminus Z$.
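As a quick sanity check, added here as an illustration rather than taken from the original text, the two costs used in the examples satisfy (C2) and (C3) with $Z$ empty; they are bounded, as (C1) requires, only once restricted to bounded supports:

```latex
c(x,y) = -x \cdot y = -\sum_{k=1}^{d} x^k y^k
\;\Longrightarrow\;
D^2_{x^i y^j} c(x,y) = -\delta_{ij},
\qquad
\det\big[D^2_{x^i y^j} c(x,y)\big] = (-1)^d \neq 0,

c(x,y) = |x-y|^2 = |x|^2 - 2\,x \cdot y + |y|^2
\;\Longrightarrow\;
D^2_{x^i y^j} c(x,y) = -2\,\delta_{ij},
\qquad
\det\big[D^2_{x^i y^j} c(x,y)\big] = (-2)^d \neq 0 .
```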

2.2. Assumption on the capacity constraint. $\bar h$ is non-negative
and Lebesgue integrable:
$0 \le \bar h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$.

Given marginal densities $0 \le f, g \in L^1(\mathbf{R}^d)$ with same
total mass, to avoid talking about the trivial case, we will always
assume that a feasible solution exists:
$\Gamma := \Gamma(f,g)^{\bar h} \ne \emptyset$.

3. Uniqueness: every optimizer is extreme 

Our arguments are based crucially on a preparatory lemma from real 
analysis. 

Lemma 3.1 (Marginal-preserving volume exchange). Fix
$\Delta = (1, \ldots, 1) \in \mathbf{R}^d$. If a Lebesgue set
$U \subset \mathbf{R}^d \times \mathbf{R}^d$ is non-negligible, then
for all $\delta > 0$ sufficiently small there is a subset
$V \subset U$ of positive volume such that $(x + \delta\Delta, y)$,
$(x, y + \delta\Delta)$ and $(x + \delta\Delta, y + \delta\Delta)$
all belong to $U$ whenever $(x,y) \in V$. Moreover, $V$ may be taken
to lie in the interior of a coordinate hypercube of side-length
$\delta$. The vertex of this hypercube at which $\Delta$ is an
outward normal may be chosen to lie at any Lebesgue point $z_0$ where
$U$ has full density. ($V$ may be chosen to have Lebesgue density
$1/2^{2d}$ at $z_0$.)

Proof. Let $I = [0,1]$ denote the unit interval and $I^d$ the
hypercube, so that $H_{ij} = (2i-1)I^d \times (2j-1)I^d$ for
$i,j \in \{0,1\}$ define four hypercubes with disjoint interiors in
$2d$ dimensions. Let $z_0$ be a Lebesgue point where $U$ has full
density; we may suppose $z_0$ is the origin without loss of
generality. Letting $\delta^{-1}U$ denote the dilation of $U$ around
$z_0$ by factor $\delta^{-1}$, we see the fraction of $H_{ij}$
outside of $\delta^{-1}U$ tends to zero as $\delta \to 0$. For
$\delta$ sufficiently small, we may assume all four of these
fractions to be strictly less than $1/4$. Let $V \subset H_{00}$ be
the set of $(x,y) \in H_{00} \cap \delta^{-1}U$ for which
$(x + \Delta, y)$, $(x, y + \Delta)$ and $(x + \Delta, y + \Delta)$
also belong to $\delta^{-1}U$. If $(x,y) \in H_{00} \setminus V$, it
is because at least one of the four points above does not belong to
$\delta^{-1}U$. Thus
$H_{00} \setminus V = J_{00} \cup J_{01} \cup J_{10} \cup J_{11}$
where each of the four sets
$J_{ij} + (i\Delta, j\Delta) := H_{ij} \setminus \delta^{-1}U$ has
volume strictly less than $1/4$. Thus
$\mathcal{L}^{2d}[H_{00} \setminus V] < 1$, implying $V$ is a set of
positive measure. Discarding from $V$ any points on the boundary of
$H_{00}$ and contracting by a factor $\delta$ yields the lemma. (The
parenthetical remark is obtained by noting $1/4$ is arbitrary in the
argument above; taking $\delta$ smaller forces the volume of
$H_{00} \setminus V$ to be as small as we please. Thus $V$ fills a
larger and larger fraction of the hypercube $H_{00}$ near its vertex,
where both $H_{00}$ and hence $V$ have Lebesgue density $1/2^{2d}$.) □

This allows us to give a much nicer characterization of the extreme
points of $\Gamma(f,g)^{\bar h}$ than any available for the
unconstrained problem ($\bar h = +\infty$) [AKM11].

Proposition 3.2 (All or nothing characterization of extreme points).
Let $\Gamma = \Gamma(f,g)^{\bar h}$ denote the set of joint densities
bounded by $\bar h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$ and
with marginals $f, g \in L^1(\mathbf{R}^d)$. A density
$h \in \Gamma$ is an extreme point of $\Gamma$ if and only if
$h = 1_W \bar h$ for some Lebesgue measurable set
$W \subset \mathbf{R}^d \times \mathbf{R}^d$.

Proof. Recall $h \in \Gamma$ implies $0 \le h \le \bar h$. If $h$ is
extremal, we claim these inequalities cannot both be strict on any
subset $U \subset \mathbf{R}^{2d}$ of positive volume. To show the
contrapositive, suppose such a $U$ existed. Then for some
$\epsilon > 0$ the set
$U_\epsilon = \{z \in \mathbf{R}^{2d} \mid \epsilon < h < \bar h - \epsilon\}$
would also have positive volume. Lemma 3.1 provides $\delta > 0$ and
$V \subset U_\epsilon$ of positive measure such that all four points
$(x \pm \frac{\delta}{2}\Delta, y \pm \frac{\delta}{2}\Delta)$ and
$(x \pm \frac{\delta}{2}\Delta, y \mp \frac{\delta}{2}\Delta)$ lie in
$U_\epsilon$ whenever
$(x - \frac{\delta}{2}\Delta, y - \frac{\delta}{2}\Delta) \in V$.
Setting $I = [0,1]$ and using $i,j \in \{0,1\}$ to define four
coordinate hypercubes $H_{ij} = (2i-1)I^d \times (2j-1)I^d$, after
translation (and with the hypercubes scaled to side-length $\delta$,
as in Lemma 3.1) we may also assume
$V \subset H_{00} \setminus \partial H_{00}$. Then

$$\tilde h(x,y) := \begin{cases}
+1 & \text{if } (x,y) \in V \text{ or } (x - \delta\Delta, y - \delta\Delta) \in V, \\
-1 & \text{if } (x - \delta\Delta, y) \in V \text{ or } (x, y - \delta\Delta) \in V, \\
\phantom{+}0 & \text{otherwise}
\end{cases}$$

is well-defined. Notice that $\tilde h$ is constructed using
symmetries which ensure its integrals with respect to $x$ and with
respect to $y$ both vanish, the other variable being held fixed. In
other words, the marginals of $\tilde h$ vanish. Also, $\tilde h$ is
supported in $U_\epsilon$, where we have room to add or subtract
$\epsilon$ from $h \in (\epsilon, \bar h - \epsilon)$. Thus
$h_\pm := h \pm \epsilon\tilde h$ both belong to $\Gamma$; they are
distinct since $V$ has positive volume. Expressing
$h = \frac{1}{2}(h_+ + h_-)$ as a convex combination of $h_\pm$ shows
$h$ is not an extreme point of $\Gamma$.

Conversely, we claim any $h = 1_W \bar h$ with
$W \subset \mathbf{R}^{2d}$ Lebesgue is extreme. To see this, suppose
$h = 1_W \bar h$ could be decomposed as a convex combination
$h = \frac{1}{2}(h_+ + h_-)$ of $h_\pm \in \Gamma$. Since $h_\pm$ are
both non-negative, they must both vanish where $h$ does; thus
$h_\pm = 0$ outside of $W$. Since both $h_\pm \le \bar h$, they must
both coincide with $\bar h$ where $h$ does; thus $h_\pm = \bar h$ in
$W$. This shows $h_+ = h_-$, establishes extremality of
$h = 1_W \bar h$, and completes the proof of the proposition. □
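The perturbation in the first half of the proof has a simple discrete counterpart, sketched below under the assumption that a plan is a small matrix of cell masses. The helper `exchange` and the numbers used are hypothetical illustrations: mass is added on two diagonal cells of a 2x2 block and removed from the two off-diagonal cells, so every row and column sum is unchanged.

```python
def exchange(h, i, j, k, l, eps):
    """Copy of h with +eps at cells (i,j),(k,l) and -eps at (i,l),(k,j)."""
    out = [row[:] for row in h]
    out[i][j] += eps
    out[k][l] += eps
    out[i][l] -= eps
    out[k][j] -= eps
    return out

def row_sums(h):
    return [sum(row) for row in h]

def col_sums(h):
    return [sum(col) for col in zip(*h)]

cap = 0.5                              # constant capacity per cell (assumed)
h = [[0.25, 0.25], [0.25, 0.25]]       # strictly between 0 and cap everywhere
h_plus = exchange(h, 0, 0, 1, 1, 0.125)
h_minus = exchange(h, 0, 0, 1, 1, -0.125)

# h is the midpoint of two distinct feasible plans, hence not extreme.
midpoint = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(h_plus, h_minus)]
print(h_plus, h_minus, midpoint)
```

The room to move by `eps` in both directions is exactly what fails when every cell sits at 0 or at the capacity, which is the discrete shadow of the converse half of the proof.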

More importantly, it allows us to construct a perturbative argument
for uniqueness, much simpler than the original proof of [KM11].
Theorem 3.3 (Every optimizer is extreme). Let the cost $c(x,y)$
satisfy conditions (C1)-(C3), fix
$0 \le \bar h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$ and take
$0 \le f, g \in L^1(\mathbf{R}^d)$ such that
$\Gamma := \Gamma(f,g)^{\bar h} \ne \emptyset$. If $h \in \Gamma$ is
optimal, i.e. $h \in \operatorname{argmin}_{k \in \Gamma} c[k]$, then
$h$ is an extreme point of $\Gamma$.

Proof. Suppose $h \in \Gamma$ is not an extreme point of $\Gamma$. We
shall establish the theorem by constructing a perturbation of $h$
which decreases the cost $c[h]$. Proposition 3.2 asserts
$h \ne 1_W \bar h$, meaning the set $U$ of Lebesgue points $z$ for
$h$ and $\bar h$ at which $0 < h < \bar h$ has positive volume.
Similarly, for sufficiently small $\epsilon > 0$ the set
$U_\epsilon = \{z \in U \setminus Z \mid \epsilon < h < \bar h - \epsilon\}$
also has positive volume, where $Z$ is the negligible closed set on
which hypotheses (C2) $c \in C^2$ and (C3)
$\det[D^2_{x^i y^j} c] \ne 0$ may fail. Let $z_0 = (x_0, y_0)$ be a
point where $U_\epsilon$ has full Lebesgue density; we may assume
$z_0$ to be the origin without loss of generality. After a linear
transformation of the variable $y$ (as in [MPW10] or §5 of [KM11]),
we can also assume $D^2_{x^i y^j} c(z_0) = -\delta_{ij}$ without
losing generality. Set $I = [0,1]$ and
$H_{ij} = (2i-1)I^d \times (2j-1)I^d$ for $i,j \in \{0,1\}$. Applying
Lemma 3.1 in the new coordinates yields $\delta > 0$ and a set
$V \subset H_{00}$ of positive measure such that
$(x + i\delta\Delta, y + j\delta\Delta) \in U_\epsilon \cap H_{ij}$
for $i,j \in \{0,1\}$ whenever $(x,y) \in V$ (with the hypercubes
scaled to side-length $\delta$, as in the lemma). The perturbation
$\tilde h_\delta = \tilde h$ from the proof of Proposition 3.2 is
again well-defined, and its marginals vanish. Moreover,
$h_\delta = h + \epsilon\tilde h_\delta \in \Gamma$ is a feasible
competitor since $|\tilde h_\delta| \le 1_{U_\epsilon}$ and
$h \in (\epsilon, \bar h - \epsilon)$ on $U_\epsilon$. The change in
cost produced by this perturbation is

$$\begin{aligned}
c[h_\delta] - c[h]
&= \epsilon \int_{\mathbf{R}^d \times \mathbf{R}^d} c(x,y)\,\tilde h_\delta(x,y)\,d^dx\,d^dy \\
&= \epsilon \int_V \big[c(x,y) + c(x + \delta\Delta, y + \delta\Delta) - c(x + \delta\Delta, y) - c(x, y + \delta\Delta)\big]\,d^dx\,d^dy \\
&= \epsilon\delta^2 \int_V \Big[\int_0^1\!\!\int_0^1 \sum_{i,j=1}^{d} D^2_{x^i y^j} c(x + s\delta\Delta, y + t\delta\Delta)\,ds\,dt\Big]\,d^dx\,d^dy.
\end{aligned}$$

In this formula, the arguments $z$ of the continuous mixed partials
$D^2_{x^i y^j} c(z)$ all lie within distance $\delta\sqrt{2d}$ of a
point $z_0$ at which $\sum_{i,j} D^2_{x^i y^j} c(z_0) = -d$. Thus for
$\delta$ small enough, the perturbed cost $c[h_\delta - h] < 0$ is
negative, precluding optimality of $h$. The contrapositive implies
the only optimizers $h$ of $c$ are extreme points of $\Gamma$. □
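For the bilinear cost $c(x,y) = -x \cdot y$ the second difference in the display above can be evaluated in closed form, which may help make the sign of the perturbed cost concrete. This worked instance is an illustration added here and is not part of the original argument:

```latex
c(x,y) + c(x+\delta\Delta,\, y+\delta\Delta) - c(x+\delta\Delta,\, y) - c(x,\, y+\delta\Delta)
  = -\delta^2\, \Delta \cdot \Delta
  = -\delta^2 d ,
\qquad\text{so}\qquad
c[h_\delta] - c[h] = -\epsilon\,\delta^2\, d\; \mathcal{L}^{2d}[V] < 0 .
```

Here no smallness of $\delta$ is needed for the sign, since the mixed partials of this cost are constant.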

Corollary 3.4 (Uniqueness of Optimizer). Under the same hypotheses,
the minimum in Theorem 3.3 is uniquely attained.

Proof. If $h_0$ and $h_1$ both minimize $c[h]$ on $\Gamma$, then so
does $h_{1/2} = \frac{1}{2}(h_0 + h_1)$ since $c[\,\cdot\,]$ is
linear and $\Gamma$ is convex. Theorem 3.3 then asserts extremality
of $h_{1/2}$ in $\Gamma$, so $h_0 = h_1$. This is the desired
uniqueness. □

4. Simulations and symmetries 

In case $\bar h(x,y) = \text{const}$, the problem has symmetries
which limit the possible solutions. After introducing these
symmetries, we use them to establish a new class of examples for
which the optimal transport can be displayed explicitly. These
include the two-by-two checkerboard (Example 1.1 of [KM11] or
Corollary 4.3 below) as a particular case.

Figure 1 shows a simulation of the optimal solutions with uniform
marginals for the distance squared cost on $I \times I$ with
$\bar h = 3$ and with $\bar h = 3/2$. Red represents the region $W$
where the $\bar h$ constraint is saturated; in the complementary blue
region, no transportation occurs. These computer simulations were
originally presented to us by Brian Wetton who remarked on the
symmetry manifested between the $\bar h = 3$ case and the
$\bar h = 3/2$ case. This symmetry is explained by the following
lemma, which applies to any pair of Hölder conjugates $p$ and $q$.



Figure 1. Red represents the saturation set $W$ given $\bar h$.
Note the symmetry between (a) $\bar h = 3$ and (b) $\bar h = 3/2$.



Lemma 4.1 (Symmetries). Let $\Omega, \Lambda \subset \mathbf{R}^d$ be
bounded sets with unit volume. Set $f = 1_\Omega$ and
$g = 1_\Lambda$, where $1_\Omega$ denotes the indicator function of
the set $\Omega$. Let
$\bar h = \bar h_p := p\,1_{\Omega \times \Lambda}$ have constant
density $p > 1$ on the product $\Omega \times \Lambda$, and
$c(x,y) = -x \cdot y$. Given any set
$W \subset \Omega \times \Lambda$, let $R(W)$ denote its image under
the reflection $R(x,y) = (x,-y)$, and
$\tilde W := (\Omega \times \Lambda) \setminus W$ its set theoretic
complement. If $p\,1_W \in \Gamma^{\bar h_p}(1_\Omega, 1_\Lambda)$
then $q\,1_{\tilde W} \in \Gamma^{\bar h_q}(1_\Omega, 1_\Lambda)$,
where $p^{-1} + q^{-1} = 1$ are Hölder conjugates. Moreover

$$c[1_W] + b(\Omega) \cdot b(\Lambda) = -c[1_{\tilde W}] = c[1_{R(\tilde W)}],$$

where $b(\Omega) = \int_\Omega x$ is the center of mass of $\Omega$.
Thus $p\,1_W$ minimizes $c$ on
$\Gamma^{\bar h_p}(1_\Omega, 1_\Lambda)$ if and only if
$q\,1_{R(\tilde W)}$ minimizes $c$ on
$\Gamma^{q 1_{R(\Omega \times \Lambda)}}(1_\Omega, 1_{-\Lambda})$.

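Proof. For $W \subset \Omega \times \Lambda$ set
$W(x) = \{y \in \mathbf{R}^d \mid (x,y) \in W\}$ and
$W^{-1}(y) = \{x \in \mathbf{R}^d \mid (x,y) \in W\}$. Notice
$p\,1_W \in \Gamma^{\bar h_p}(1_\Omega, 1_\Lambda)$ if and only if
$p|W(x)| = 1$ and $p|W^{-1}(y)| = 1$ for a.e.
$(x,y) \in \Omega \times \Lambda$. From $|W(x)| + |\tilde W(x)| = 1$
we conclude $|\tilde W(x)| = 1 - \frac{1}{p} = \frac{1}{q}$ for a.e.
$x \in \Omega$, and similarly $|\tilde W^{-1}(y)| = \frac{1}{q}$.
Thus $q\,1_{\tilde W} \in \Gamma^{\bar h_q}(1_\Omega, 1_\Lambda)$. On
the other hand,

$$c[1_W] + c[1_{\tilde W}] = \int_W c(x,y) + \int_{\tilde W} c(x,y)
  = \int_{\Omega \times \Lambda} -x \cdot y
  = \int_\Omega \int_\Lambda -x \cdot y
  = -b(\Omega) \cdot b(\Lambda)$$

and

$$c[1_{R(\tilde W)}] = \int_{R(\tilde W)} c(x,y)
  = \int_{\tilde W} c(x,-y)
  = -\int_{\tilde W} c(x,y)
  = -c[1_{\tilde W}],$$

which imply the remaining assertions. □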
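The two identities in the lemma can be checked exactly on a grid. The sketch below is a hypothetical discretization, not the authors' simulation: it takes $\Omega = \Lambda = [0,1]$ sampled at `n = 6` midpoints, `p = 3`, and an arbitrarily chosen cyclic band for `W`; any set with `n/p` cells in each row and column would serve equally well.

```python
from fractions import Fraction

n, p = 6, 3                                   # grid size and capacity density (assumed)
q = Fraction(p, p - 1)                        # Hoelder conjugate: 1/p + 1/q = 1
x = [Fraction(2 * i + 1, 2 * n) for i in range(n)]   # midpoints of [0, 1]

cells = {(i, j) for i in range(n) for j in range(n)}
W = {(i, j) for (i, j) in cells if (j - i) % n < n // p}   # n/p cells per row/column
W_comp = cells - W                                          # complement of W

def marginals(S, density):
    """Row and column marginals of density * 1_S (feasible when all equal 1)."""
    rows = [Fraction(density) * sum(1 for (i, j) in S if i == a) / n for a in range(n)]
    cols = [Fraction(density) * sum(1 for (i, j) in S if j == b) / n for b in range(n)]
    return rows, cols

def cost(S, reflect=False):
    """Midpoint-rule value of c[1_S] for c = -x*y; reflect=True evaluates on R(S)."""
    sign = -1 if reflect else 1
    return sum(-x[i] * (sign * x[j]) for (i, j) in S) / (n * n)

b = sum(x) / n                                # center of mass of [0, 1]
print(marginals(W, p), marginals(W_comp, q))
print(cost(W) + b * b, -cost(W_comp), cost(W_comp, reflect=True))
```

Exact rational arithmetic is used so that the equalities hold without rounding tolerances.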

The next proposition shows these elementary symmetries yield a broad
class of examples in the self-dual case $p = 2 = q$.

Proposition 4.2 (Universal optimizer for a balanced set with
self-dual constraint). Fix uniform densities $f = 1_\Omega$ and
$g = 1_\Lambda$ on two bounded Lebesgue sets
$\Omega, \Lambda \subset \mathbf{R}^d$ of unit volume. Let
$\bar h = \bar h_2 := 2 \cdot 1_{\Omega \times \Lambda}$ have
constant density 2 on $\Omega \times \Lambda$, and fix
$c(x,y) = -x \cdot y$. If $\Omega$ and $\Lambda$ are balanced,
meaning $\Omega = -\Omega$ and $\Lambda = -\Lambda$, the minimizer
$\bar h_2 1_W$ of $c$ on $\Gamma^{\bar h_2}(1_\Omega, 1_\Lambda)$
satisfies $W = R(\tilde W) = -R(\tilde W)$ (up to sets of measure
zero), where $R(x,y) = (x,-y)$. It follows that
$W = \{(x,y) \in \Omega \times \Lambda \mid x \cdot y > 0\}$.

Proof. Note that
$\Gamma := \Gamma(1_\Omega, 1_\Lambda)^{\bar h} \ne \emptyset$ since
it contains $1_{\Omega \times \Lambda}$. Thus there exists
$\bar h_2 1_W$ minimizing $c$ on $\Gamma$ as in [KM11]. Lemma 4.1
ensures $\bar h_2 1_{R(\tilde W)}$ also minimizes $c$ on $\Gamma$, as
do $\bar h_2 1_{-W}$ and hence
$\bar h_2 1_{R(\widetilde{-W})} = \bar h_2 1_{-R(\tilde W)}$. The
uniqueness established in Corollary 3.4 therefore implies
$W = -W = R(\tilde W) = -R(\tilde W)$, up to sets of measure zero.
For each Lebesgue point $z_0 = (x_0, y_0) \in W$ of full density,
this shows $W$ also contains $(-x_0, -y_0)$ but not
$R(z_0) = (x_0, -y_0)$ nor $-R(z_0) = (-x_0, y_0)$. Therefore choose
a Lebesgue point $z_0 = (x_0, y_0) \in W$ of full density with
$x_0 \ne 0 \ne y_0$ and $x_0 \cdot y_0 \ne 0$, noting almost all
points in $W$ have this form. For $r > 0$ sufficiently small, the
ball $Z_1 := Z = B_r(z_0)$ will be disjoint from its reflections
$Z_3 = -Z$, $Z_2 = R(Z)$, and $Z_4 = -R(Z)$, and moreover
$(x,y) \in Z$ will imply $x \cdot y$ has the same sign as
$x_0 \cdot y_0$. We claim $x_0 \cdot y_0 > 0$. Otherwise
$\tilde h = -1_{Z_1 \cup Z_3} + 1_{Z_2 \cup Z_4}$ would be a feasible
perturbation, and $c(x,y) + c(-x,-y) > 0 > c(x,-y) + c(-x,y)$ for all
$(x,y) \in Z$ shows $\bar h_2 1_W + \tilde h$ lowers the cost $c$ in
$\Gamma(f,g)^{\bar h_2}$. This contradicts the minimality of
$\bar h_2 1_W$. Thus, up to sets of measure zero, $W$ must be
contained in
$W_+ := \{(x,y) \in \Omega \times \Lambda \mid x \cdot y > 0\}$. On
the other hand, the fact that $\Omega$ and $\Lambda$ are balanced
makes it easy to check feasibility of $\bar h_2 1_{W_+}$. Feasibility
of $W$ then shows the containment $W \subset W_+$ cannot be strict,
apart from a set of measure zero, so $W = W_+$ as desired. □

As an immediate corollary, we recover Example 1.1 of [KM11],
displayed in Figure 2. Note this analytical example (like the
numerical ones preceding) dispels a number of natural conjectures
about the optimizing set by demonstrating that its topology need not
be simple and its boundary need not be smooth. Symmetry and
self-duality give a much more satisfactory explanation for its
singular nature than the original argument, which was based on
guessing a solution to the linear program dual to (2).

Figure 2. The two-by-two checkerboard solves $\bar h = 2$.

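Corollary 4.3 (The 2x2 checkerboard revisited). Taking
$\Omega = \Lambda = [-\frac{1}{2}, \frac{1}{2}]$, the preceding
proposition shows the minimizer $\bar h_2 1_W$ of $c[\,\cdot\,]$ on
$\Gamma^{\bar h_2}(1_\Omega, 1_\Omega)$ to be given by
$W = [-\frac{1}{2}, 0]^2 \cup [0, \frac{1}{2}]^2$.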
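A small brute-force experiment is consistent with this corollary. The sketch below is an assumption-laden toy, not the simulation mentioned in the paper: it samples $[-\frac{1}{2}, \frac{1}{2}]$ at `n = 4` midpoints, searches only among all-or-nothing plans (every cell either empty or saturated at density 2), and compares the cheapest one with the checkerboard. All identifiers are hypothetical.

```python
from itertools import combinations, product

n = 4                                              # even grid size (assumed)
pts = [(2 * i + 1 - n) / (2 * n) for i in range(n)]   # midpoints of [-1/2, 1/2]
cap = 2.0 / n**2                                   # cell mass allowed by density 2

def cost(W):
    """Cost of the plan saturating exactly the cells in W, with c = -x*y."""
    return sum(-pts[i] * pts[j] * cap for (i, j) in W)

# With density 2, uniform marginals force n/2 saturated cells per row and column.
row_choices = list(combinations(range(n), n // 2))
candidates = []
for rows in product(row_choices, repeat=n):
    col_counts = [sum(j in r for r in rows) for j in range(n)]
    if all(k == n // 2 for k in col_counts):
        candidates.append(frozenset((i, j) for i, r in enumerate(rows) for j in r))

best = min(candidates, key=cost)
checkerboard = frozenset((i, j) for i in range(n) for j in range(n)
                         if pts[i] * pts[j] > 0)
print(best == checkerboard, cost(best))
```

Restricting the search to saturated-or-empty cells mirrors Proposition 3.2 and Theorem 3.3; on a grid this restriction is a modelling choice rather than something the paper proves for discrete problems.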

5. Afterword 

When transport capacity between $x$ and $y \in \mathbf{R}^d$ is
constrained by a density
$\bar h \in L^1(\mathbf{R}^d \times \mathbf{R}^d)$, the 'all or
nothing' (a.k.a. bang-bang) characterization of extremal plans
$h \in \Gamma^{\bar h}(f,g)$ makes optimal transport of $f$ onto $g$
appear easier to analyze than the unconstrained problem
$\bar h = +\infty$ [AKM11]. Nevertheless, the simple examples with
capacity constraints solved above using numerical or theoretical
methods display an unexpectedly rich range of phenomena and raise new
questions of their own. Although not discussed here, the linear
program dual to the capacity constrained problem [KM11] turns out to
be more complicated to solve than that of the unconstrained problem
$\bar h = +\infty$ [RR98] [V09]. We hope to analyze this difficulty
in future work.